Confluence
Import pages and attachments from Confluence into the Knowledge Repository. This connector supports page hierarchy and attachments where allowed by Confluence permissions.
When to use
- Centralizing internal documentation from Confluence into the Knowledge Repository for search, QA, or knowledge ops.
- Migrating documentation or creating an offline/read-only snapshot of a Confluence space.
Notes
- Some Confluence spaces or pages may be restricted; the account used by the connector must have the necessary permissions to read them.
- SVAHNAR does not store Confluence files permanently; it ingests (reads) the data directly from your Confluence instance during import.
Usage
- Prepare a Confluence account with appropriate API permissions (read access to the target space and pages).
- Build the ConfluenceData configuration (below) with the Confluence base URL, token, and the target space_key.
- Call the import endpoint of the Knowledge Repository connector (or run the connector tool), providing the ConfluenceData payload.
- Monitor logs for pages that could not be fetched due to permissions, rate limits, or unsupported content types.
Typical flow
- The connector lists pages in the given space_key and walks the page hierarchy.
- For each page, it optionally pulls labels, comments, and attachments depending on flags in the configuration.
- The connector normalizes content (optionally preserving newlines) and sends extracted text and metadata into the Knowledge Repository ingestion pipeline.
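The normalization step can be pictured with a small sketch. This is not the connector's actual code: the function name normalize_text is hypothetical, and reading keep_newlines = false as "collapse every run of whitespace into a single space" is an assumption based on the parameter description.

```python
import re


def normalize_text(text: str, keep_newlines: bool) -> str:
    """Illustrative whitespace normalization; not the connector's real implementation."""
    # Unify Windows and old-Mac line endings first.
    text = text.replace("\r\n", "\n").replace("\r", "\n")
    if keep_newlines:
        # Preserve line breaks as they are; only trim trailing spaces on each line.
        return "\n".join(line.rstrip() for line in text.split("\n"))
    # Assumption: collapse every whitespace run (including newlines) into one space.
    return re.sub(r"\s+", " ", text).strip()


sample = "Release notes\n\n\nFixed   login bug.\r\nAdded export."
print(repr(normalize_text(sample, keep_newlines=True)))
# 'Release notes\n\n\nFixed   login bug.\nAdded export.'
print(repr(normalize_text(sample, keep_newlines=False)))
# 'Release notes Fixed login bug. Added export.'
```

Preserving newlines tends to suit pages with lists, tables, or code, while the collapsed form produces continuous paragraphs.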
Parameter reference
- url: Base URL of your Confluence instance. Examples: https://yourcompany.atlassian.net/wiki or an on-prem Confluence like https://confluence.internal.local.
- token: Personal access token or API token used to authenticate the connector. Must be kept secret; provide via environment variables or a secure secret manager when possible.
- space_key: The Confluence space key to import (e.g., ENG, HR, DOCS). The connector will enumerate pages under this space.
- username: The username or service account email associated with token. Used for API calls and helpful in logs/audit.
- keep_newlines: When true, the connector preserves original newline characters from Confluence page content. When false, the connector collapses multiple newlines and normalizes whitespace to produce continuous paragraphs.
- include_labels: Whether to import Confluence page labels/tags as metadata. Labels are useful for filtering later.
- include_comments: Whether to import page comments as separate records or appended notes. Comments can be noisy; enable only if you need discussion history.
- include_archived_content: If true, the connector will also include pages that are archived in the space (when Confluence exposes archived state via API).
- include_restricted_content: If true, the connector attempts to import pages that have view restrictions. The connector will only succeed for restricted pages if the authenticated token/user has access.
- include_attachments: Whether to download attachments (pdf, images, docs) referenced by pages. Attachments are fetched when permissions allow and can be stored/ingested according to your Knowledge Repository storage policy.
Example payload
Provide the schema values in your ingestion call. (The connector expects a JSON or equivalent payload matching ConfluenceData.)
{
  "url": "https://yourcompany.atlassian.net/wiki",
  "token": "<REDACTED>",
  "space_key": "DOCS",
  "username": "svc-confluence@yourcompany.com",
  "keep_newlines": true,
  "include_labels": true,
  "include_comments": false,
  "include_archived_content": false,
  "include_restricted_content": false,
  "include_attachments": true
}
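To keep the token out of source files, the payload can be assembled from environment variables before it is sent. The following is a minimal sketch, not part of the connector. The helper names and the variable names CONFLUENCE_URL, CONFLUENCE_TOKEN, and CONFLUENCE_USERNAME are hypothetical, and treating all four string fields as required and defaulting unset flags to false are assumptions.

```python
import json
import os

REQUIRED_STRINGS = ("url", "token", "space_key", "username")
BOOLEAN_FLAGS = (
    "keep_newlines",
    "include_labels",
    "include_comments",
    "include_archived_content",
    "include_restricted_content",
    "include_attachments",
)


def build_confluence_data(space_key: str, env=None, **flags: bool) -> dict:
    """Assemble a ConfluenceData-shaped dict, reading secrets from the environment."""
    env = os.environ if env is None else env
    payload = {
        # Hypothetical environment variable names; use whatever your deployment defines.
        "url": env.get("CONFLUENCE_URL", "").rstrip("/"),
        "token": env.get("CONFLUENCE_TOKEN", ""),
        "space_key": space_key,
        "username": env.get("CONFLUENCE_USERNAME", ""),
    }
    missing = [key for key in REQUIRED_STRINGS if not payload[key]]
    if missing:
        raise ValueError(f"missing required values: {', '.join(missing)}")
    for name in BOOLEAN_FLAGS:
        # Assumption of this sketch: flags that are not passed default to False.
        value = flags.pop(name, False)
        if not isinstance(value, bool):
            raise TypeError(f"{name} must be a boolean")
        payload[name] = value
    if flags:
        raise TypeError(f"unknown flags: {', '.join(sorted(flags))}")
    return payload


def redacted(payload: dict) -> str:
    """Serialize the payload for logging without leaking the token."""
    return json.dumps({**payload, "token": "<REDACTED>"}, indent=2)


demo_env = {
    "CONFLUENCE_URL": "https://yourcompany.atlassian.net/wiki/",
    "CONFLUENCE_TOKEN": "example-token",
    "CONFLUENCE_USERNAME": "svc-confluence@yourcompany.com",
}
payload = build_confluence_data(
    "DOCS", env=demo_env, keep_newlines=True, include_labels=True, include_attachments=True
)
print(redacted(payload))
```

Passing env explicitly keeps the demo self-contained; in real use, omit it so the values come from os.environ or a secret manager.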
Authentication & permissions
- Use a service account or API token with read access to the desired space.
- If you need to import restricted pages or attachments, ensure the service account has explicit access to those items.
- If you use Atlassian cloud, prefer using an API token tied to a service account rather than a personal token.
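When a credential needs to be checked independently of the connector, it helps to know how Atlassian's REST APIs expect it to be sent: Confluence Cloud accepts HTTP Basic auth built from the account email and API token, while Data Center personal access tokens are sent as Bearer tokens. The sketch below only builds the header; the function name auth_header is hypothetical, and how the connector itself authenticates is not specified here.

```python
import base64


def auth_header(username: str, token: str, cloud: bool = True) -> dict:
    """Build an Authorization header for a manual credential check."""
    if cloud:
        # Atlassian Cloud: HTTP Basic auth over "email:api_token".
        raw = f"{username}:{token}".encode("utf-8")
        return {"Authorization": "Basic " + base64.b64encode(raw).decode("ascii")}
    # Data Center personal access tokens are sent as Bearer tokens.
    return {"Authorization": f"Bearer {token}"}


headers = auth_header("svc-confluence@yourcompany.com", "example-token")
print(headers["Authorization"].split()[0])  # Basic
```

Avoid printing the full header value in logs, since Base64 is an encoding and not encryption.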
Attachments handling
- When include_attachments is enabled, the connector will attempt to download attachments for each page and attach them to the ingestion record (subject to permission and size limits).
- Very large attachments may be skipped or truncated depending on repository ingestion limits; check the connector logs for skipped files.
Limitations & caveats
- Rate limiting: Confluence Cloud enforces API rate limits. Expect imports of large spaces to be throttled and to take longer.
- Content formats: Some Confluence macros or embedded content may not render to plain text perfectly. The connector strips or attempts to resolve common macros but may leave placeholders for complex macros.
- Attachments and binary files: The connector can fetch attachments but does not convert proprietary file types automatically.
Troubleshooting
- 401/403 errors: Check API token and service account permissions.
- Missing pages: Confirm space_key is correct and the token user has read access to those pages.
- Rate-limited imports: Retry with exponential backoff or import the space in smaller batches.
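Retrying with exponential backoff can be sketched as a small wrapper around whatever call triggers the import. Everything named here is hypothetical: RateLimited stands in for however your client reports a rate-limit response (for example HTTP 429), and the delays are illustrative defaults, not values recommended by Confluence or the connector.

```python
import random
import time


class RateLimited(Exception):
    """Hypothetical error raised by a client when the API reports rate limiting."""


def with_backoff(call, max_attempts=5, base_delay=1.0, max_delay=60.0, sleep=time.sleep):
    """Retry call() on RateLimited, doubling the delay ceiling after each failure."""
    for attempt in range(max_attempts):
        try:
            return call()
        except RateLimited:
            if attempt == max_attempts - 1:
                raise  # out of attempts; surface the error to the caller
            ceiling = min(max_delay, base_delay * 2 ** attempt)
            # Full jitter: wait a random time up to the ceiling so that
            # parallel workers do not retry in lockstep.
            sleep(random.uniform(0, ceiling))
```

A caller would wrap the import request, for example with_backoff(lambda: start_import(payload)), where start_import is a placeholder for your own function. The sleep parameter is injectable so the wrapper can be exercised in tests without real waiting.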